Discussion of: Treelets — An Adaptive Multi-Scale Basis for Sparse Unordered Data

Authors

N. MEINSHAUSEN

P. BÜHLMANN

Abstract

We congratulate Lee, Nadler and Wasserman (henceforth LNW) on a very interesting paper on new methodology and supporting theory. Treelets seem to tackle two important problems of modern data analysis at once. For datasets with many variables, treelets give powerful predictions even if variables are highly correlated and redundant. Perhaps more importantly, interpretation of the results is intuitive: useful insights about relevant groups of variables can be gained.

Our comments and questions include:

(i) Could the success of treelets be replicated by a combination of hierarchical clustering and PCA?
(ii) When choosing a suitable basis, treelets seem to be largely an unsupervised method. Could the results be even more interpretable and powerful if treelets took into account some supervised response variable?
(iii) Interpretability of the result hinges on the sparsity of the final basis. Do we expect that the selected groups of variables will always be sufficiently small to be amenable to interpretation?

1. Treelets or hierarchical clustering combined with PCA.

A main part of the treelet algorithm achieves two main objectives: (1) Variables are ordered in a hierarchical scheme. Highly correlated variables are typically "close" in the hierarchy. (2) A basis on the tree is chosen. Each node of the tree is associated with a "sum" variable (and also a "difference" variable). Clearly, treelets are more elegant than any method trying to achieve these two goals separately. As LNW write in Section 1: "The novelty and contribution of our approach is the simultaneous construction of a data-driven multi-scale orthogonal basis and a hierarchical cluster tree." We are left wondering, though, how different treelets are from the following scheme. First, variables are ordered in a hierarchical clustering scheme; for concreteness,
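The two objectives named in Section 1 (building the hierarchy and choosing "sum" and "difference" variables at each node) can be illustrated with a single merge step. The sketch below is only one plausible reading, not LNW's implementation: it assumes similarity is measured by absolute correlation, assumes all variances are positive, and realizes the local rotation as an eigendecomposition of the 2x2 covariance block. The names `treelet_step`, `cov`, `basis` and `active` are hypothetical.

```python
import numpy as np

def treelet_step(cov, basis, active):
    """Merge the most correlated pair of active variables via a local 2x2 PCA.

    cov    : current covariance matrix (p x p), positive variances assumed
    basis  : current orthonormal basis, columns are basis vectors (p x p)
    active : set of indices still eligible for merging (the "sum" variables)
    """
    idx = sorted(active)
    sd = np.sqrt(np.diag(cov))
    corr = np.abs(cov / np.outer(sd, sd))
    sub = corr[np.ix_(idx, idx)]          # fancy indexing returns a copy
    np.fill_diagonal(sub, -np.inf)        # ignore self-correlation
    a, b = np.unravel_index(np.argmax(sub), sub.shape)
    i, j = idx[a], idx[b]

    # Local PCA on the pair. eigh returns eigenvalues in ascending order,
    # so the columns are reversed to put the high-variance direction first.
    _, vecs = np.linalg.eigh(cov[np.ix_([i, j], [i, j])])
    rot = np.eye(cov.shape[0])
    rot[np.ix_([i, j], [i, j])] = vecs[:, ::-1]

    new_cov = rot.T @ cov @ rot           # entry (i, j) is now ~0
    new_basis = basis @ rot               # still orthonormal
    # Index i now holds the "sum" variable and stays active;
    # index j holds the "difference" variable and is retired.
    new_active = set(active) - {j}
    return new_cov, new_basis, new_active, (i, j)
```

Calling `treelet_step` repeatedly until one active index remains yields both a merge order (the tree) and an orthonormal basis. A two-stage alternative would instead fix the tree first with an off-the-shelf hierarchical clustering routine and only then compute components within each cluster.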

Similar resources

Treelets—an Adaptive Multi-scale Basis for Sparse Unordered Data by Ann

In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered—with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity: the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this pape...

arXiv:0707.0481v2 [stat.ME] 31 Aug 2007 Treelets — An Adaptive Multi-Scale Basis for Sparse Unordered Data

In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered — with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity: the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this pa...

arXiv:0707.0481v1 [stat.ME] 3 Jul 2007 Treelets — An Adaptive Multi-Scale Basis for Sparse Unordered Data

In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered — with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity; the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this pa...

Discussion Of: Treelets-an Adaptive Multi-scale Basis for Sparse Unordered Data.

We would like to congratulate Lee, Nadler and Wasserman on their contribution to clustering and data reduction methods for high p and low n situations. A composite of clustering and traditional principal components analysis, treelets is an innovative method for multi-resolution analysis of unordered data. It is an improvement over traditional PCA and an important contribution to clustering meth...

Discussion Of: Treelets—an Adaptive Multi-scale Basis for Sparse Unordered Data by Nicolai Meinshausen

Publication date: 2007